44 research outputs found

    Stabilizing reinforcement learning control: A modular framework for optimizing over all stable behavior

    Full text link
    We propose a framework for the design of feedback controllers that combines the optimization-driven and model-free advantages of deep reinforcement learning with the stability guarantees provided by using the Youla-Kucera parameterization to define the search domain. Recent advances in behavioral systems theory allow us to construct a data-driven internal model; this enables an alternative realization of the Youla-Kucera parameterization based entirely on input-output exploration data. Perhaps of independent interest, we formulate and analyze the stability of such data-driven models in the presence of noise. The Youla-Kucera approach requires a stable "parameter" for controller design. For the training of reinforcement learning agents, the set of all stable linear operators is given explicitly through a matrix factorization approach. Moreover, a nonlinear extension is given using a neural network to express a parameterized set of stable operators, which enables seamless integration with standard deep learning libraries. Finally, we show how these ideas can also be applied to tune fixed-structure controllers. Comment: Preprint; 18 pages. arXiv admin note: text overlap with arXiv:2304.0342
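
    As a concrete illustration of the last ingredient, the sketch below shows one standard way to map unconstrained weights onto a set of stable discrete-time linear operators, so gradient-based training never leaves the stable set. This is a minimal sketch in PyTorch; the name stable_operator, the margin gamma, and the spectral-norm rescaling are illustrative assumptions, not the paper's matrix factorization.

        import torch

        def stable_operator(W: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
            # Map an arbitrary square matrix W onto a matrix with spectral
            # norm <= gamma < 1, i.e. a Schur-stable linear operator.
            # The map is differentiable, so W can be trained with standard
            # deep learning tooling while stability holds by construction.
            s = torch.linalg.matrix_norm(W, ord=2)  # largest singular value
            return gamma * W / torch.clamp(s, min=gamma)

        # Every rollout of the resulting operator is a contraction:
        A = stable_operator(torch.randn(4, 4))
        x = torch.randn(4)
        for _ in range(200):
            x = A @ x
        print(x.norm())  # decays toward zero regardless of the initial W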

    Reinforcement Learning with Partial Parametric Model Knowledge

    Full text link
    We adapt reinforcement learning (RL) methods for continuous control to bridge the gap between complete ignorance and perfect knowledge of the environment. Our method, Partial Knowledge Least Squares Policy Iteration (PLSPI), takes inspiration from both model-free RL and model-based control. It uses incomplete information from a partial model and retains RL's data-driven adaptation towards optimal performance. The linear quadratic regulator provides a case study; numerical experiments demonstrate the effectiveness and resulting benefits of the proposed method. Comment: IFAC World Congress 202
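
    To make the LQR case study concrete, here is a minimal NumPy sketch of the least-squares policy iteration (LSPI) backbone that PLSPI builds on: fit a quadratic Q-function for the current linear policy from transition data, then improve the policy from the Q-matrix blocks. How PLSPI injects the partial parametric model into these steps is not reproduced here, and names such as lstdq_lqr are illustrative.

        import numpy as np

        def lstdq_lqr(data, K, gamma=0.95):
            # Least-squares fit of Q(x, u) = [x; u]^T H [x; u] for the
            # linear policy u = K x, from tuples (x, u, cost, x_next).
            m, n = K.shape
            rows, costs = [], []
            for x, u, cost, xn in data:
                z = np.concatenate([x, u])
                zn = np.concatenate([xn, K @ xn])
                # One-step Bellman residual, linear in the entries of H
                rows.append((np.outer(z, z) - gamma * np.outer(zn, zn)).ravel())
                costs.append(cost)
            h, *_ = np.linalg.lstsq(np.array(rows), np.array(costs), rcond=None)
            H = h.reshape(n + m, n + m)
            return 0.5 * (H + H.T)  # symmetrize

        def improve_policy(H, n):
            # Greedy policy from the Q-matrix blocks: u = -Huu^{-1} Hux x
            Hux, Huu = H[n:, :n], H[n:, n:]
            return -np.linalg.solve(Huu, Hux)

    Iterating these two steps on sufficiently exciting exploration data recovers the LQR solution without ever identifying the system matrices explicitly.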

    Meta-Reinforcement Learning for the Tuning of PI Controllers: An Offline Approach

    Full text link
    Meta-learning is a branch of machine learning which trains neural network models to synthesize a wide variety of data in order to rapidly solve new problems. In process control, many systems have similar and well-understood dynamics, which suggests it is feasible to create a generalizable controller through meta-learning. In this work, we formulate a meta reinforcement learning (meta-RL) control strategy that can be used to tune proportional-integral controllers. Our meta-RL agent has a recurrent structure that accumulates "context" to learn a system's dynamics through a hidden state variable in closed-loop. This architecture enables the agent to automatically adapt to changes in the process dynamics. In tests reported here, the meta-RL agent was trained entirely offline on first order plus time delay systems, and produced excellent results on novel systems drawn from the same distribution of process dynamics used for training. A key design element is the ability to leverage model-based information offline during training in simulated environments while maintaining a model-free policy structure for interacting with novel processes where there is uncertainty regarding the true process dynamics. Meta-learning is a promising approach for constructing sample-efficient intelligent controllers. Comment: 23 pages; postprint
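
    The recurrent architecture can be pictured with a short sketch: a GRU consumes the closed-loop history, its hidden state serves as the accumulated "context", and a head proposes positive PI gains. This is a hedged illustration with assumed names (MetaRLTuner) and sizes, not the paper's exact network.

        import torch
        import torch.nn as nn

        class MetaRLTuner(nn.Module):
            # A GRU accumulates closed-loop "context" (e.g. setpoint error
            # and control signal at each step) in its hidden state; a linear
            # head maps the final state to positive PI gains (Kp, Ki).
            def __init__(self, obs_dim=2, hidden=32):
                super().__init__()
                self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
                self.head = nn.Linear(hidden, 2)

            def forward(self, obs_seq, h=None):
                # obs_seq: (batch, time, obs_dim)
                out, h = self.gru(obs_seq, h)
                gains = nn.functional.softplus(self.head(out[:, -1]))
                return gains, h  # carry h across calls to keep adapting

    Because adaptation happens through the hidden state rather than through weight updates, a trained agent of this form can respond to drifting process dynamics online without further training.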

    Optical Lattices for Atom Based Quantum Microscopy

    Full text link
    We describe new techniques in the construction of optical lattices to realize a coherent atom-based microscope, comprising two atomic species used as target and probe atoms, each in an independently controlled optical lattice. Precise and dynamic translation of the lattices allows atoms to be brought into spatial overlap to induce atomic interactions. For this purpose, we have fabricated two highly stable, hexagonal optical lattices, with widely separated wavelengths but identical lattice constants, using diffractive optics. The relative translational stability of 12 nm permits controlled interactions and even entanglement operations with high fidelity. Translation of the lattices is realized through a monolithic electro-optic modulator array, capable of moving the lattice smoothly over one lattice site in 11 microseconds, or rapidly on the order of 100 nanoseconds. Comment: 7 pages, 9 figures
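
    That two widely separated wavelengths can share an identical lattice constant follows from elementary interference geometry: fringe spacing scales as wavelength over the sine of the beam half-angle, so the crossing angle can compensate for the wavelength. The sketch below uses the simplified two-beam formula d = lambda / (2 sin(theta/2)); the three-beam hexagonal geometry has a different prefactor but the same scaling, and the example wavelengths are hypothetical, not the pair used in the paper.

        import numpy as np

        def matching_angle(lam1, theta1, lam2):
            # Angle between beams of wavelength lam2 that reproduces the
            # fringe spacing d = lam1 / (2 sin(theta1 / 2)) of the first pair.
            s = (lam2 / lam1) * np.sin(theta1 / 2)
            if s > 1:
                raise ValueError("no real angle reproduces this spacing")
            return 2 * np.arcsin(s)

        lam1, lam2 = 532e-9, 810e-9    # hypothetical target/probe wavelengths
        theta1 = np.deg2rad(60.0)
        theta2 = matching_angle(lam1, theta1, lam2)
        d = lam1 / (2 * np.sin(theta1 / 2))
        print(f"common spacing {d * 1e9:.0f} nm, second angle {np.rad2deg(theta2):.1f} deg")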

    Meta-Reinforcement Learning for Adaptive Control of Second Order Systems

    Full text link
    Meta-learning is a branch of machine learning which aims to synthesize data from a distribution of related tasks to efficiently solve new ones. In process control, many systems have similar and well-understood dynamics, which suggests it is feasible to create a generalizable controller through meta-learning. In this work, we formulate a meta reinforcement learning (meta-RL) control strategy that takes advantage of known, offline information for training, such as a model structure. The meta-RL agent is trained over a distribution of model parameters, rather than a single model, enabling the agent to automatically adapt to changes in the process dynamics while maintaining performance. A key design element is the ability to leverage model-based information offline during training, while maintaining a model-free policy structure for interacting with new environments. Our previous work has demonstrated how this approach can be applied to the industrially relevant problem of tuning proportional-integral controllers to control first order processes. In this work, we briefly reintroduce our methodology and demonstrate how it can be extended to proportional-integral-derivative controllers and second order systems. Comment: AdCONIP 2022. arXiv admin note: substantial text overlap with arXiv:2203.0966
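
    A minimal rollout of the extended setting is sketched below: a second-order process under PID control, the kind of simulation a meta-RL tuner would be trained against across a distribution of model parameters. The Euler discretization, parameter values, and name simulate_pid are illustrative assumptions, not the paper's setup.

        import numpy as np

        def simulate_pid(gains, K=1.0, zeta=0.7, wn=1.0, dt=0.05, steps=400, sp=1.0):
            # Step response of G(s) = K wn^2 / (s^2 + 2 zeta wn s + wn^2)
            # under PID control, with a simple Euler discretization.
            Kp, Ki, Kd = gains
            y, dy, integ, e_prev = 0.0, 0.0, 0.0, sp
            ys = []
            for _ in range(steps):
                e = sp - y
                integ += e * dt
                u = Kp * e + Ki * integ + Kd * (e - e_prev) / dt
                e_prev = e
                # second-order dynamics: y'' = K wn^2 u - 2 zeta wn y' - wn^2 y
                ddy = K * wn**2 * u - 2 * zeta * wn * dy - wn**2 * y
                dy += ddy * dt
                y += dy * dt
                ys.append(y)
            return np.array(ys)

        # A meta-RL agent would propose (Kp, Ki, Kd) and be scored on the
        # rollout, e.g. by integrated absolute error, across sampled
        # (K, zeta, wn) drawn from the training distribution.
        traj = simulate_pid((2.0, 1.0, 0.5))
        print(f"settled near {traj[-1]:.3f}")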